Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues
Despite the remarkable advancements in machine translation, the current
sentence-level paradigm faces challenges when dealing with highly-contextual
languages like Japanese. In this paper, we explore how context-awareness can
improve the performance of the current Neural Machine Translation (NMT) models
for English-Japanese business dialogue translation, and what kind of context
provides meaningful information to improve translation. As business dialogue
involves complex discourse phenomena but offers scarce training resources, we
adapted a pretrained mBART model, fine-tuning it on multi-sentence dialogue data,
which allows us to experiment with different contexts. We investigate the
impact of larger context sizes and propose novel context tokens encoding
extra-sentential information, such as speaker turn and scene type. We make use
of Conditional Cross-Mutual Information (CXMI) to explore how much of the
context the model uses and generalise CXMI to study the impact of the
extra-sentential context. Overall, we find that models leverage both preceding
sentences and extra-sentential context (with CXMI increasing with context size)
and we provide a more focused analysis on honorifics translation. Regarding
translation quality, increased source-side context paired with scene and
speaker information improves the model performance compared to previous work
and our context-agnostic baselines, as measured by BLEU and COMET.
Comment: MT Summit 2023, research track, link to paper in proceedings: https://aclanthology.org/2023.mtsummit-research.23
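The CXMI measure used above can be estimated from the per-sentence log-probabilities that a context-agnostic and a context-aware model assign to the same reference translations. A minimal sketch, assuming log-probabilities are already computed (the function name and interface are illustrative, not the paper's implementation):

```python
def cxmi(logprobs_context_agnostic, logprobs_context_aware):
    """Estimate Conditional Cross-Mutual Information (CXMI) as the mean
    gain in reference log-likelihood when context is provided, i.e.
    CXMI ~ H(Y|X) - H(Y|X, C).  Positive values suggest the model
    actually uses the extra context."""
    assert len(logprobs_context_agnostic) == len(logprobs_context_aware)
    gains = [with_c - without_c
             for without_c, with_c in zip(logprobs_context_agnostic,
                                          logprobs_context_aware)]
    return sum(gains) / len(gains)
```

Restricting the inputs to the log-probabilities of specific target tokens (e.g. honorific markers) gives the kind of focused analysis the abstract mentions.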
BLEU Meets COMET: Combining Lexical and Neural Metrics Towards Robust Machine Translation Evaluation
Although neural-based machine translation evaluation metrics, such as COMET
or BLEURT, have achieved strong correlations with human judgements, they are
sometimes unreliable in detecting certain phenomena that can be considered as
critical errors, such as deviations in entities and numbers. In contrast,
traditional evaluation metrics, such as BLEU or chrF, which measure lexical or
character overlap between translation hypotheses and human references, have
lower correlations with human judgements but are sensitive to such deviations.
In this paper, we investigate several ways of combining the two approaches in
order to increase robustness of state-of-the-art evaluation methods to
translations with critical errors. We show that by using additional information
during training, such as sentence-level features and word-level tags, the
trained metrics improve their capability to penalize translations with specific
troublesome phenomena, which leads to gains in correlation with human judgments
and on recent challenge sets across several language pairs.
Comment: Accepted at EAMT 202
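The paper combines the two metric families during training; a much simpler illustration of the underlying idea is a score-level interpolation, where a lexical metric acts as a guardrail against critical errors a neural metric might miss (the weighting scheme below is an assumption for exposition, not the paper's method):

```python
def combined_score(bleu, comet, lexical_weight=0.3):
    """Interpolate a lexical metric (BLEU, on a 0-100 scale) with a
    neural metric (COMET, roughly on a 0-1 scale) after rescaling BLEU
    to 0-1.  With a non-zero lexical weight, a high COMET score cannot
    fully mask a severe lexical deviation such as a changed number."""
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lexical_weight must be in [0, 1]")
    return lexical_weight * (bleu / 100.0) + (1.0 - lexical_weight) * comet
```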
Learning Disentangled Representations of Negation and Uncertainty
Negation and uncertainty modeling are long-standing tasks in natural language
processing. Linguistic theory postulates that expressions of negation and
uncertainty are semantically independent from each other and the content they
modify. However, previous works on representation learning do not explicitly
model this independence. We therefore attempt to disentangle the
representations of negation, uncertainty, and content using a Variational
Autoencoder. We find that simply supervising the latent representations results
in good disentanglement, but auxiliary objectives based on adversarial learning
and mutual information minimization can provide additional disentanglement
gains.
Comment: Accepted to ACL 2022. 18 pages, 7 figures. Code and data are available at https://github.com/jvasilakes/disentanglement-va
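Disentangling factors in a VAE typically starts by reserving separate slices of the latent vector for each factor, so that supervision and auxiliary objectives can target each slice independently. A minimal sketch of that partitioning step (the sub-space sizes are arbitrary placeholders, not values from the paper):

```python
def split_latent(z, sizes=(8, 8, 48)):
    """Partition a latent vector into (negation, uncertainty, content)
    sub-spaces so each factor can be supervised or regularised
    separately, e.g. with adversarial or mutual-information penalties."""
    n_neg, n_unc, n_con = sizes
    assert len(z) == n_neg + n_unc + n_con, "latent size must match partition"
    negation = z[:n_neg]
    uncertainty = z[n_neg:n_neg + n_unc]
    content = z[n_neg + n_unc:]
    return negation, uncertainty, content
```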
Non-Exchangeable Conformal Risk Control
Split conformal prediction has recently sparked great interest due to its
ability to provide formally guaranteed uncertainty sets or intervals for
predictions made by black-box neural models, ensuring a predefined probability
of containing the actual ground truth. While the original formulation assumes
data exchangeability, some extensions handle non-exchangeable data, which is
often the case in many real-world scenarios. In parallel, some progress has
been made in conformal methods that provide statistical guarantees for a
broader range of objectives, such as bounding the best F1-score or
minimizing the false negative rate in expectation. In this paper, we leverage
and extend these two lines of work by proposing non-exchangeable conformal risk
control, which allows controlling the expected value of any monotone loss
function when the data is not exchangeable. Our framework is flexible, makes
very few assumptions, and allows weighting the data based on its relevance for
a given test example; a careful choice of weights may result in tighter bounds,
making our framework useful in the presence of change points, time series, or
other forms of distribution drift. Experiments with both synthetic and
real-world data show the usefulness of our method.
Comment: ICLR 202
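The core calibration step in conformal risk control can be sketched in a few lines: scan candidate thresholds and keep the smallest one whose weighted calibration risk stays below the target level. This is a simplified illustration under stated assumptions (monotone losses, a known worst-case loss bound for the test point); it is not the paper's full procedure:

```python
def risk_controlling_lambda(cal_losses, weights, lambdas, alpha, loss_bound=1.0):
    """Pick the smallest threshold lambda whose weighted calibration risk
    (with the unseen test point's loss replaced by its worst-case bound)
    is <= alpha.  `cal_losses(lam)` returns the list of calibration
    losses at threshold lam; losses are assumed monotone non-increasing
    in lam, and `weights` encodes each calibration point's relevance to
    the test example."""
    total_weight = sum(weights) + 1.0  # unit weight on the test point
    for lam in sorted(lambdas):
        losses = cal_losses(lam)
        risk = (sum(w * l for w, l in zip(weights, losses))
                + loss_bound) / total_weight
        if risk <= alpha:
            return lam
    return max(lambdas)  # fall back to the most conservative threshold
```

Down-weighting calibration points that are far (in time or distribution) from the test example is what makes the non-exchangeable variant useful under drift.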
Uncertainty in Natural Language Generation: From Theory to Applications
Recent advances of powerful Language Models have allowed Natural Language
Generation (NLG) to emerge as an important technology that can not only perform
traditional tasks like summarisation or translation, but also serve as a
natural language interface to a variety of applications. As such, it is crucial
that NLG systems are trustworthy and reliable, for example by indicating when
they are likely to be wrong; and supporting multiple views, backgrounds and
writing styles -- reflecting diverse human sub-populations. In this paper, we
argue that a principled treatment of uncertainty can assist in creating systems
and evaluation protocols better aligned with these goals. We first present the
fundamental theory, frameworks and vocabulary required to represent
uncertainty. We then characterise the main sources of uncertainty in NLG from a
linguistic perspective, and propose a two-dimensional taxonomy that is more
informative and faithful than the popular aleatoric/epistemic dichotomy.
Finally, we move from theory to applications and highlight exciting research
directions that exploit uncertainty to power decoding, controllable generation,
self-assessment, selective answering, active learning, and more.
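One of the simplest uncertainty signals an NLG system can expose is the entropy of its next-token distribution; a minimal sketch (this is one generic example of an uncertainty measure, not a method proposed in the paper):

```python
import math

def predictive_entropy(next_token_probs):
    """Shannon entropy (in nats) of a next-token distribution: a basic
    uncertainty signal a generation system could use for self-assessment
    or selective answering, e.g. abstaining when entropy is high."""
    return -sum(p * math.log(p) for p in next_token_probs if p > 0.0)
```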
Findings of the WMT 2021 shared task on quality estimation
© (2021) The Authors. Published by ACL. This is an open access article available under a Creative Commons licence.
The published version can be accessed at the following link on the publisher’s website: http://www.statmt.org/wmt21/pdf/2021.wmt-1.71.pdf
We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels. This edition focused on two main novel additions: (i) prediction for unseen languages, i.e. zero-shot settings, and (ii) prediction of sentences with catastrophic errors. In addition, new data was released for a number of languages, especially post-edited data. Participating teams from 19 institutions submitted altogether 1263 systems to different task variants and language pairs.
MLQE-PE: A multilingual quality estimation and post-editing dataset
© 2020 The Authors. For reuse permissions, please contact the Authors.
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE). The dataset contains eleven language pairs, with human labels for up to 10,000 translations per language pair in the following formats: sentence-level direct assessments and post-editing effort, and word-level good/bad labels. It also contains the post-edited sentences, as well as titles of the articles where the sentences were extracted from, and the neural MT models used to translate the text.
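The annotation layers described above can be pictured as one record per translation. A sketch of such a layout, where the field names are assumptions chosen for exposition rather than the dataset's official schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MLQEPERecord:
    """Illustrative layout for one MLQE-PE example; field names are
    placeholders, not the released file format."""
    source: str                 # original sentence
    translation: str            # raw neural MT output
    post_edit: str              # human-corrected translation
    da_score: float             # sentence-level direct assessment
    pe_effort: float            # post-editing effort score
    word_tags: List[str] = field(default_factory=list)  # word-level good/bad
```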
Construction of a biodiversity knowledge repository using a text mining-based framework
Aiming to make the information encapsulated by biodiversity literature more accessible and searchable, we have developed a text mining-based framework for automatically transforming text into a structured knowledge repository. A text mining workflow employing information extraction techniques, i.e., named entity recognition and relation extraction, was implemented in the Argo platform and was subsequently applied to biodiversity literature to extract structured information. The resulting annotations were stored in a repository following the emerging Open Annotation standard, thus promoting interoperability with external applications. Accessible as a SPARQL endpoint, the repository supports knowledge discovery over a large body of biodiversity literature by retrieving annotations matching user-specified queries.
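The retrieval step can be pictured as filtering annotation records by entity type and value. A toy stand-in for such a query, using plain dicts in place of Open Annotation entries and in-memory filtering in place of SPARQL (all names here are illustrative):

```python
def find_annotations(annotations, entity_type, value=None):
    """Return annotation records whose type matches `entity_type` and,
    optionally, whose body matches `value` -- a minimal in-memory
    analogue of querying an annotation repository."""
    return [a for a in annotations
            if a.get("type") == entity_type
            and (value is None or a.get("body") == value)]
```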